ABSTRACT

Splicing of U12-dependent introns requires the function of U11, U12, U6atac, U4atac, and U5 snRNAs. Recent studies
have suggested that U6atac and U12 snRNAs interact extensively with each other, as well as with the pre-mRNA by 
Watson-Crick base pairing. The overall structure and many of the sequences are very similar to the highly conserved 
analogous regions of U6 and U2 snRNAs. We have identified the homologs of U6atac and U12 snRNAs in the plant 
Arabidopsis thaliana. These snRNAs are significantly diverged from human, showing overall identities of 65% for 
U6atac and 55% for U12 snRNA. However, there is almost complete conservation of the sequences and structures that 
are implicated in splicing. The sequence of plant U6atac snRNA shows complete conservation of the nucleotides that 
base pair to the 5' splice site sequences of U12-dependent introns in human. The immediately adjacent AGAGA 
sequence, which is found in human U6atac and all U6 snRNAs, is also conserved. High conservation is also observed 
in the sequences of U6atac and U12 that are believed to base pair with each other. The intramolecular U6atac 
stem-loop structure immediately adjacent to the U12 interaction region differs from the human sequence in 9 out of 
21 positions. Most of these differences are in base pairing regions with compensatory changes occurring across the
stem. To show that this stem-loop was functional, it was transplanted into a human suppressor U6atac snRNA 
expression construct. This chimeric snRNA was inactive in vivo but could be rescued by coexpression of a U4atac 
snRNA expression construct containing compensatory mutations that restored base pairing to the chimeric U6atac 
snRNA. These data show that base pairing of U4atac snRNA to U6atac snRNA has a required role in vivo and that the 
plant U6atac intramolecular stem-loop is the functional analog of the human sequence. 
INTRODUCTION

The presence of multiple introns in most nuclear mRNA 
coding genes is a distinctive feature of the genomes of 
animals and higher plants. The consensus features of 
splice sites in these two groups of organisms are very 
similar, although there may be some differences in the 
mechanism of splice site definition (Wiebauer et al., 
1988; Simpson & Filipowicz, 1996; Brown & Simpson, 
1998). Even more striking is the conservation of the 
sequences and structures of the small nuclear RNAs 
that are involved in spliceosome structure and function 
(Brow & Guthrie, 1988; Guthrie & Patterson, 1988;
Reddy & Busch, 1988). The most conserved regions of 
these snRNAs are the portions of U6 and U2 that are 
believed to comprise the central set of RNA-RNA in- 
teractions in the spliceosome (Nilsen, 1998). In these 
regions, the sequences of human and plant snRNAs 
are almost identical. Although this conservation of se- 
quences over the billion years of evolution that sepa- 
rate these taxonomic groups testifies to their important 
function in splicing, the lack of variation makes it diffi- 
cult to use phylogenetic covariation to substantiate po- 
tential RNA-RNA interactions. 

The recent identification of a minor class of nuclear 
pre-mRNA introns that are spliced by a distinct alter- 
native spliceosome has provided an unexpected ex- 
ample in which to evaluate the present models of RNA 
interactions in the spliceosome (reviewed in Tarn & 
Steitz, 1997; Burge et al., 1998b; Nilsen, 1998). These
introns were first identified exclusively in animals (Jack- 
son, 1991; Hall & Padgett, 1994) and the splicing path- 
way and snRNA components were initially identified in 
extracts of human cells (Tarn & Steitz, 1996a, 1996b). 
Subsequently, it was discovered that introns with iden- 
tical consensus splice site features are present in plants 
(Wu et al., 1996). This suggested that the origins of this 
class predated the divergence of the plant and animal 
kingdoms. 

The snRNAs that are involved in splicing this minor 
(U12-dependent) class of introns in human cells have 
been shown to be functional analogs of the major
(U2-dependent) class spliceosomal snRNAs. U11 snRNA
appears to be the functional analog of U1 snRNA, U12 
snRNA is the analog of U2 snRNA, U4atac snRNA is 
the analog of U4 snRNA and U6atac snRNA is the 
analog of U6 snRNA. U5 snRNA appears to function in 
both spliceosomes (reviewed in Tarn & Steitz, 1997). 
The functional similarities of the two sets of snRNAs 
are given added support by the apparent conservation 
of RNA-RNA interactions between the pre-mRNA splice 
sites, U12 snRNA, U6atac snRNA, and U11 snRNA 
(see Tarn & Steitz, 1997; Burge et al., 1998b; Nilsen, 
1998). 

As part of an effort to understand the structure/ 
function relationships in these newly identified snRNAs, 
we wanted to carry out a phylogenetic comparison 
particularly of U12 and U6atac snRNAs because they 
are thought to be most central to the splicing reac- 
tions. U12-dependent introns have been found in fish, 
amphibians, birds, mammals, insects, jellyfish, and 
higher plants (Burge et al., 1998a). However,
U12-dependent introns do not appear to exist in the yeast
Saccharomyces cerevisiae or the nematode Caeno- 
rhabditis elegans based on analysis of the complete 
or nearly complete genome sequences, respectively 
(Burge et al., 1998a). Thus, the animal-plant diver- 
gence represents the deepest branching known of the 
U12-dependent introns. 

A recent genomic sequence database search identi- 
fied 11 probable U12-dependent introns out of 19,553 
introns in the plant Arabidopsis thaliana (Burge et al., 
1998a). Remarkably, in almost every instance, the splice 
site sequences of the plant U12-dependent introns are 
identical to the animal sequences in spite of the evo- 
lutionary distance. The existence of these introns in 
plants implies the parallel existence of a set of snRNAs 
to splice them. Here we report on the identification of 
snRNAs from Arabidopsis that appear to be the plant 
homologs of U6atac and U12 snRNAs. Surprisingly, 
the sequence conservation of these snRNAs between 
humans and plants is much less than that seen in the 
major class snRNAs. This allows us to use phylo- 
genetic covariation to investigate the RNA-RNA in- 
teractions that have been proposed to occur in the 
U12-dependent spliceosome. 



RESULTS 

Identification of a putative plant 
U6atac snRNA homolog 

A search of plant genomic sequences for similarities to 
the snRNAs of the human U12-dependent spliceosome 
revealed a provocative match to the sequence of human
U6atac snRNA. This sequence, shown in Figure 1,
appears to encode an snRNA of similar length to
human U6atac with about 65% sequence identity. The 
sequence similarity is highest in the 5' portion of the 
putative snRNA where the regions of interaction with 
U12 snRNA and the pre-mRNA 5' splice site have been
localized. 
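
The identity figure quoted above depends on how alignment gaps are counted, which the text does not specify. The following is a minimal sketch, assuming that identity is the number of matching columns divided by all aligned columns in which at least one sequence has a residue. The function name `percent_identity` and the short test strings are hypothetical and are not the Arabidopsis or human sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption (not stated in the text): a column counts as compared when
    at least one sequence has a residue there, and a gap ('-') opposite a
    residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column is empty in both sequences; ignore it
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable columns in alignment")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical toy alignment, used only to exercise the function.
    print(round(percent_identity("ACGU-A", "ACGAUA"), 1))
```

Excluding gapped columns from the denominator instead would raise the reported identity, so the convention should be stated alongside any figure computed this way.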

The sequence of the flanking genomic region of this 
sequence contains elements that would appear to sup- 
port transcription of this snRNA in plant cells based on 
their similarity to elements flanking active U6 snRNA 
genes (Waibel & Filipowicz, 1990). An upstream se- 
quence element (USE) with the sequence GTCCCA 
CATCG occurs at position -67 to -57 upstream of the 
putative transcription initiation site in Arabidopsis U6atac 
snRNA. This sequence is identical to USE sequences 
found in the U6-1 and U6-26 snRNA genes of Arabi- 
dopsis at position -66 to -56 from the transcription 
initiation sites (Waibel & Filipowicz, 1990). A second 
conserved USE is a TATA-like box with the sequence 
TATATATA at position -32 to -25 in the Arabidopsis 
U6atac snRNA gene. A similar element with the se- 
quence TTTATATA at position -31 to -24 is found in 
the Arabidopsis U6-1, U6-26, and U6-29 snRNA genes. 
The final USE is a cap-adjacent sequence, GATT, lo- 
cated between -4 and -1, which is conserved in Ara- 
bidopsis U6atac snRNA and all three U6 snRNA genes 
of Arabidopsis. All three Arabidopsis U6 snRNA genes
have been shown to be transcriptionally active and
resistant to α-amanitin (Waibel & Filipowicz, 1990). These
similarities suggest that this Arabidopsis U6atac gene 
is transcriptionally active and is probably transcribed 
by RNA polymerase III similar to the U6 snRNA genes. 
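
The promoter coordinates given above follow the convention that the last base before the transcription initiation site is -1. The sketch below illustrates that numbering under stated assumptions: the helper `element_position` is hypothetical, and the 80-nt upstream string is a placeholder in which only the three motifs quoted in the text are real and every other position is filled with N.

```python
from typing import Optional, Tuple


def element_position(upstream: str, motif: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the first motif match, numbered relative to
    the transcription initiation site, with the last upstream base at -1."""
    index = upstream.upper().find(motif.upper())
    if index < 0:
        return None
    start = index - len(upstream)
    end = start + len(motif) - 1
    return start, end


# Placeholder upstream region covering -80 to -1. Only the motifs are taken
# from the text; the N filler is NOT the real Arabidopsis sequence.
UPSTREAM = "N" * 13 + "GTCCCACATCG" + "N" * 24 + "TATATATA" + "N" * 20 + "GATT"

if __name__ == "__main__":
    for name, motif in [("USE", "GTCCCACATCG"),
                        ("TATA-like box", "TATATATA"),
                        ("cap-adjacent", "GATT")]:
        print(name, element_position(UPSTREAM, motif))
```

With the motifs placed as described in the text, the helper reports -67 to -57, -32 to -25, and -4 to -1, which confirms that the stated element lengths and coordinates are mutually consistent.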

To verify that this RNA was expressed in cells, a 
sample of total RNA from Arabidopsis thaliana was re- 
verse transcribed using a primer derived from the ge- 
nomic sequence described above. The resulting cDNA 
was amplified by PCR using the same primer and a
primer from the predicted 5' end of the snRNA. A PCR 
fragment of the expected size was amplified and cloned 
that matched the genomic sequence. Amplification re- 
actions carried out without reverse transcription failed 
to yield a DNA product, confirming that the RT-PCR 
product was not derived from contaminating DNA (data 
not shown). As a further test of the expression of this 
snRNA, a sample of total RNA was separated on a 
denaturing polyacrylamide gel, blotted to nylon mem- 
brane, and probed with the labeled PCR fragment. The 
probe hybridized to a single RNA of about 125 nt 
FIGURE 1. Sequence of the Arabidopsis U6atac snRNA gene. The conserved promoter elements in the -80 to -1 region
discussed in the text are enclosed in open boxes. The human U6atac snRNA sequence is shown beneath the Arabidopsis 
sequence and conserved nucleotide positions are enclosed in shaded boxes. The numbering for human U6atac is used for 
the snRNA sequences. Gaps introduced into the alignment are shown as hyphens (-). The sequence is from GenBank 
Accession #AB006702. 



(Fig. 2). The sequence of the 5' and 3' ends of the 
U6atac snRNA were subsequently determined by 
sequencing of cloned 5' and 3' RACE products pre- 
pared as described in Materials and Methods. The 
sequences of these clones matched the genomic se- 
quence at the 5' and 3' ends. The length of the RNA 
determined by the RACE procedure is 123 nt. Although 
this is longer than the U6 snRNAs of human (107 nt) or
Arabidopsis (102 nt), it is similar to the length of human 
U6atac snRNA (125 nt). These results establish that 
the genomic sequence is expressed as a small RNA in 
Arabidopsis. 

The putative plant U6atac snRNA homolog can be 
folded into a hypothetical secondary structure similar 
to that proposed for human U6atac snRNA (Fig. 3). 
The 5' stem-loop structure of the plant RNA is similar 
in size and position, but appears to be somewhat 
less stable than that of the human structure. In both 
RNAs, the critical nucleotides that base pair with the 
U12-dependent intron 5' splice site are located in the 



5' loop. The middle stem-loop of the plant sequence 
appears to be stronger than in human U6atac. These 
nucleotides, however, are probably base paired with 
U4atac snRNA in the di-snRNP particle rather than in 
the structure shown here for the isolated U6atac 
snRNA (Tarn & Steitz, 1996b). Other foldings can be 
generated for both the plant and human sequences. 
In the absence of physical information, the structures 
shown here are provisional at best. 

Identification of a putative plant 
U12 snRNA homolog 

The initial database searches of available plant geno- 
mic sequences did not reveal a candidate with a sig- 
nificant match to human U12 snRNA. Based on both 
the conservation of the branch site sequence in plant 
U12-dependent introns and the conservation in the plant
U6atac snRNA identified above of the putative region 
of interaction with U12 snRNA, we probed a blot of 
FIGURE 2. Expression of the Arabidopsis U6atac snRNA shown by
Northern blot analysis of total RNA. Ten micrograms of total RNA
from Arabidopsis was fractionated on an 8% denaturing polyacrylamide
gel and transferred to a nylon membrane. The blot was hybridized
with 32P-labeled U6atac snRNA cDNA. The sizes of the
32P-labeled in vitro transcribed RNA markers (Ambion) in nucleotides
are given at the left.



Arabidopsis RNA with an oligonucleotide complemen- 
tary to the branch site and U6atac interaction regions 
of human U12 snRNA. This probe hybridized to an 
RNA of approximately 175 nt (data not shown), suggesting
the existence of a potential U12 snRNA homolog
in Arabidopsis.

To determine the sequence of this putative plant U12 
snRNA homolog, we employed 3' RACE on Arabidop- 
sis size-fractionated total cell RNA in the 100-200 nt 
range. This RNA was 3' polyadenylated using yeast 
poly (A) polymerase and then reverse transcribed using 
an oligo d(T) primer. The cDNA was amplified using an 
oligonucleotide containing nt 1-24 of human U12 snRNA 
and the oligo d(T) primer. This yielded a DNA fragment 
of approximately 175 bp, which was cloned and se- 
quenced. To obtain the sequence of the 5' end of the 
RNA, a 5' RACE procedure was employed using an 
internal primer (see Materials and Methods). The com- 
plete sequence of the RNA is shown in Figure 4, aligned 
with known U12 snRNA homologs (Tarn et al., 1995;
Yu et al., 1996). Hybridization to a blot of total Arabidopsis
RNA using a probe from the complete snRNA
identified a single RNA of about 175 nt in agreement 
with the size of the predicted RNA from the RACE ex- 
periments (Fig. 5). 

The alignment in Figure 4 shows that this RNA is 
highly similar to other U12 snRNAs and thus supports 
the conclusion that this is the plant U12 snRNA. The 
plant RNA shows 52-59% sequence identity to the other 
U12 snRNAs. The similarity is strongest in the 5' end 
where the regions that interact with the intron branch 
site and with U6atac snRNA are located. Outside of 
this region, the extent of similarity was lower. The plant 
RNA contains an Sm protein binding site located at a 
similar position with a single T-to-A deviation from the 
consensus sequence (Fig. 6). Most of the size difference
between the plant and human U12 snRNAs is
because of an additional sequence of seven internal
and eight 3' nucleotides in the plant RNA.

FIGURE 3. Proposed secondary structures of U6atac snRNAs. The nucleotides that can base pair to the intron 5' splice
site are circled. A: Arabidopsis U6atac snRNA. B: Human U6atac snRNA (adapted from Tarn & Steitz, 1996b).

FIGURE 4. Sequence comparison of the putative Arabidopsis U12 snRNA with human, chicken, mouse, and Xenopus U12
snRNAs (Tarn et al., 1995; Yu et al., 1996). The nucleotide numbering is for the Arabidopsis sequence. Conserved
nucleotides are boxed and gaps introduced into the alignment are shown as hyphens (-).

This putative plant U12 snRNA homolog can be folded
into a hypothetical secondary structure very similar to
that determined for human U12 snRNA (Montzka Wassarman
& Steitz, 1992; Fig. 6). In this structure, the 7-nt
additional internal sequence in the plant homolog is 
placed in the loop region of stem-loop III. In addition, 
many of the differences between the plant and human 
sequences appear to be due to compensatory changes 
that maintain base pairs in the stem regions. 
FIGURE 5. Expression of the Arabidopsis U12 snRNA shown by
Northern blot analysis of total RNA. Ten micrograms of total RNA
from Arabidopsis was fractionated on an 8% denaturing polyacrylamide
gel and transferred to a nylon membrane. The blot was hybridized
with 32P-labeled U12 snRNA cDNA. The sizes of the RNA
markers in nucleotides are given at the left.



Conservation of proposed interactions in the 
U12-dependent spliceosome 

A major goal of this work was to use phylogenetic 
comparisons between the plant and human snRNA 
sequences to evaluate the RNA-RNA interactions pro- 
posed for the U12-dependent spliceosome. Tarn and 
Steitz (1996b) have proposed a model of the RNA 
interactions between U12 and U6atac based on anal- 
ogy with the U2-U6 snRNA interactions and the dem- 
onstration of a U12-U6atac crosslink. Figure 7 shows 
a modified version of this model that, in addition to 
the U12-U6atac pairing, also shows the base pairing 
of U12 snRNA to the intron branch site sequence 
and the pairing of U11 and U6atac snRNAs to the 
intron 5' splice site sequence as established in mam- 
malian U12-dependent splicing (Hall & Padgett, 1996; 
Tarn & Steitz, 1996a, 1996b; Yu & Steitz, 1997; In- 
corvaia & Padgett, 1998). The nucleotides that differ 
from the human sequences are shown in bold type. 
The analogous human structure is shown in Fig- 
ure 8A. Since the regions of U12 and U6atac snRNAs 
that are involved in these potential interactions are 
the most conserved regions of both snRNAs, it is not 
surprising that the human and plant structures would 
be very similar. As expected from the conservation of 
the intronic splice site sequences, the regions of plant 
U12 and U6atac snRNAs that interact with the splice 
sites are identical to the human sequences. 



Also identical in plant U6atac snRNA is the AGAGA 
sequence immediately following the 5' splice site pair- 
ing region. This sequence is also completely con- 
served in U6 snRNAs (Brow & Guthrie, 1988) and is 
required for function in yeast U6 snRNA (Fabrizio & 
Abelson, 1990; Madhani et al., 1990). Immediately fol- 
lowing this sequence is the region believed to base pair 
to U12 snRNA to form the two-part helix I interaction 
(Tarn & Steitz, 1996b). The plant U6atac snRNA differs
from the human sequence in two positions in this region.
There is a U in place of an A at position 21 that
appears to be compensated for by a U-to-A change in
U12 snRNA. These changes maintain the potential A-U
base pair as shown. The second difference is an A-to-G 
change in helix Ib at position 26. This alters the AGC
sequence in this region to GGC and allows an addi- 
tional G-C base pair to be made to U12 snRNA. This 
would also have the effect of enlarging the bulge in 
U12 snRNA between helix Ia and helix Ib to 3 nt. In U6
snRNA, alterations to the highly conserved AGC se- 
quence at the analogous position have severe effects 
on splicing, with mutations of the A residue leading to a 
block at the second step of splicing in yeast (Fabrizio & 
Abelson, 1990; Madhani et al., 1990). The role of this 
change in U6atac snRNA is addressed below. 
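
The covariation argument in this paragraph can be sketched as a small classification of how a change at two paired positions affects base pairing. The sketch assumes Watson-Crick pairs plus G-U wobble pairs count as paired; the names `can_pair` and `classify_change` are hypothetical, and the two example inputs are taken only from the changes described in the text (position 21 with its U12 partner, and the A-to-G change at position 26 opposite a C in U12).

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def can_pair(x: str, y: str, allow_wobble: bool = True) -> bool:
    """True if RNA bases x and y can form a Watson-Crick (or wobble) pair."""
    pair = (x.upper(), y.upper())
    return pair in WATSON_CRICK or (allow_wobble and pair in WOBBLE)


def classify_change(ref_pair: tuple, other_pair: tuple) -> str:
    """Compare the same two positions in a reference and another species.

    Each pair is (base in first RNA, base in second RNA).
    """
    if ref_pair == other_pair:
        return "conserved"
    ref_ok = can_pair(*ref_pair)
    other_ok = can_pair(*other_pair)
    both_changed = ref_pair[0] != other_pair[0] and ref_pair[1] != other_pair[1]
    if ref_ok and other_ok:
        return "compensatory" if both_changed else "pair-preserving"
    if ref_ok:
        return "pair-disrupting"
    if other_ok:
        return "pair-creating"
    return "unpaired in both"


if __name__ == "__main__":
    # U6atac position 21 and its U12 partner: human A-U, plant U-A.
    print(classify_change(("A", "U"), ("U", "A")))
    # U6atac position 26 opposite a C in U12: human A, plant G.
    print(classify_change(("A", "C"), ("G", "C")))
```

A classifier of this kind only reports pairing potential. Whether a predicted pair actually forms in the spliceosome still has to be tested functionally, as is done for position 26 later in the text.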

Immediately following this AGC in human U6atac 
snRNA is an intramolecular stem-loop structure which 
extends helix Ib. This stem-loop is very similar in size
and structure to a conserved stem-loop structure lo- 
cated at the analogous position in U6 snRNA. The plant 
U6atac homolog conserves the structural features of 
this stem-loop but has altered bases at 9 of the 21 
positions compared to human U6atac. Investigations of 
this structure in human U6 snRNA indicate that, al- 
though the base pairs are important for function, the 
identities of the bases are not (Sun & Manley, 1997). As 
is shown in Figure 7, nine of the ten changes can be 
accommodated in the same base-paired structure as in 
the human homolog (Fig. 8A). The sole exception is the 
change of the bulged U46 residue to a C. The structure
shown in Figure 7 is slightly different from the original 
proposal for this region of human U6atac snRNA (Tarn 
& Steitz, 1996b). The pattern of compensatory base 
changes between the human and plant sequences suggests
that both A45 and C (plant) or U (human) 46 are
bulged rather than having a single bulged residue at 
position 46. This change does not alter the upper part 
of the stem, but does rearrange the base pairs in the 
lower section, resulting in fewer noncanonical base pairs, 
and increases the calculated stabilities of both struc- 
tures by 2-3 kcal/mol. The actual structure that this 
region adopts in the spliceosome is likely to be influ- 
enced by additional tertiary RNA interactions and in- 
teractions with proteins. 

As in the case of human U12 and U6atac snRNAs 
(Tarn & Steitz, 1996b), an interaction analogous to the 
U2-U6 helix II (Datta & Weiner, 1991) cannot be made. 
FIGURE 6. Proposed secondary structures of U12 snRNAs. The nucleotides that can base pair to the intron branch site are
in outline type. The probable Sm protein binding sites are shaded. A: Arabidopsis U12 snRNA. B: Human U12 snRNA 
(adapted from Tarn et al., 1995). 



In addition, the potential for base pairs between U12 
and U6atac near the 5' end of U6atac (Tarn & Steitz, 
1996b) that resemble the U2-U6 helix III (Sun & Manley,
1995) is not conserved in the plant homologs
(compare Figs. 7 and 8A).



The U6atac snRNA intramolecular stem-loop 
structure can be functionally transplanted 

One of the intriguing conservations observed when com- 
paring U6 and U6atac snRNAs is in the structure of the 
intramolecular stem-loop that immediately follows the
region of interaction with U2 or U12, respectively. This
feature of U6 snRNA is required for splicing, but ap- 
pears to be tolerant of mutations that maintain the struc- 
ture (Sun & Manley, 1997). As noted above, the plant 
and human U6atac snRNAs differ at 9 positions out of 
21 within this region yet have a very similar predicted 
structure. We wanted to test the hypothesis that both 
stem-loops were also functionally homologous. To do 
this, we introduced all nine mutations into the stem- 
loop region of a previously described human U6atac 
suppressor snRNA expression construct (Incorvaia & 
Padgett, 1998). The parent construct contains a double 
mutation in the region that base pairs with the 5' splice 
site that suppresses a 5' splice site mutation when 
coexpressed with a similarly altered U11 snRNA con- 
struct (Incorvaia & Padgett, 1998). This suppression 
assay allows us to determine the in vivo functional ef- 
fects of second site mutations of U6atac in the pres- 
ence of the endogenous wild-type U6atac snRNA. 

The specific mutations are shown in Figure 8A (ex- 
cept for the A26G mutation which was tested sepa- 
rately; see below) and the results of the in vivo splicing 
suppression assay are shown in Figure 9. The results 
show that, as we previously demonstrated (Incorvaia & 
Padgett, 1998), the cryptic splicing phenotype of the 



P120 CC5/6GG 5' splice site mutant is suppressed to 
yield correctly spliced mRNA when U11 and U6atac 
snRNAs containing compensatory mutations in their 5' 
splice site interaction regions are coexpressed (com- 
pare Fig. 9, lanes 4 and 9). However, when the human 
U6atac stem-loop is replaced by the plant stem-loop in 
this construct, no suppression is observed (Fig. 9, 
lane 11), suggesting that this chimeric U6atac snRNA is
nonfunctional. 

This result could be due to any of several reasons. 
For example, the chimeric U6atac snRNA might be un- 
stable, it might not interact productively with human 
U4atac snRNA, or it might not interact with the other 
elements of the human U12-dependent spliceosome. 
Of these possibilities, a problem with the interaction 
with U4atac could be tested in vivo. As shown in Fig- 
ure 8B, the region of U6atac snRNA that encompasses 
the intramolecular stem-loop is also a region that base 
pairs extensively with U4atac snRNA. The mutations 
introduced by transplanting the plant stem-loop se- 
quence significantly reduced the base pairing potential 
of the chimeric U6atac with wild-type U4atac snRNA. 

To compensate for this, we prepared an expression 
construct for U4atac snRNA as we had for U11 and 
U12 snRNAs (Hall & Padgett, 1996; Kolossova & 
Padgett, 1997) by replacing the snRNA coding region 
FIGURE 8. A: Mutational analysis of U6atac snRNA function. The P120 5' splice site mutation CC5/6GG is shown above
the figure with the compensating mutations in U11 and U6atac snRNAs. The base alterations tested in the in vivo mutant 
suppressor assay are shown. The chimeric U6atac snRNA construct contained the mutations shown from C30 through A49. 
The A26G mutation (boxed) was tested separately. B: Diagram of the proposed base-pairing interaction between human 
U4atac and U6atac. The arrows on the U6atac sequence show the changes that were made in this region by constructing 
the chimeric U6atac snRNA using the Arabidopsis intramolecular stem-loop structure. The arrows on the U4atac sequence 
show the changes made in the human U4atac expression construct to compensate for the lost base pairs in the U6atac 
chimera. 
FIGURE 9. In vivo functional analysis of mutant and chimeric U6atac snRNAs. The indicated P120 minigene constructs
were transfected into CHO cells along with expression constructs for the various snRNAs. RNA extracted after 48 h was
analyzed by RT-PCR for the splicing pattern of the P120 minigene. The three major products correspond to unspliced intron
F RNA (Unspliced), RNA spliced via the U2-dependent spliceosome using internal 5' and 3' cryptic splice sites (Cryptic) and
correctly spliced P120 RNA (Spliced). Lane 1 is mock transfected CHO cell RNA. Lane 2 is RNA from cells transfected with
the pCB6 expression vector. Lane 3 is RNA from cells transfected with the wild-type P120 expression construct. Lanes 4-16 
are RNA samples from cells transfected with the intron F 5' splice site mutant CC5/6GG and the indicated snRNA 
construct(s). 



of a functional human U1 snRNA gene with the U4atac 
snRNA sequence. In addition, we prepared a mutant 
U4atac snRNA expression construct that contained 
compensating mutations to restore base pairing to the 
chimeric U6atac snRNA. We then tested the combi- 
nation of the mutant U4atac with the chimeric U6atac 
snRNAs in the in vivo splicing suppression assay. 
As shown in Figure 9, the mutant U4atac could acti- 
vate the suppression activity of the chimeric U6atac 
snRNA (lane 12). The mutant U4atac by itself (Fig. 9, 
lane 15) or with the suppressor U11 snRNA (Fig. 9, 
lane 13) was inactive for in vivo suppression. 

These results show (1) that the U6atac intramolecu- 
lar stem-loop can retain function in spite of the alter- 
ation of close to half of its residues; (2) that its function 
in splicing is conserved between plants and humans; 
(3) that this region of U6atac must also participate in 
base pairing interactions with U4atac snRNA for ex- 
pression and/or function; and (4) that U4atac plays an 
essential role in U12-dependent pre-mRNA splicing 
through its interaction with U6atac snRNA. 



The lack of conservation of A26 does not
affect function of U6atac 

As noted above, the plant U6atac snRNA differs from human
U6atac at position 26 in the helix Ib region. The sequence
of the analogous region of U6 snRNA contains
a highly conserved AGC motif which is also found in human
U6atac snRNA. The homologous sequence in plant
U6atac snRNA is GGC. Mutation of this region in yeast 
U6 snRNA leads to defects in splicing (Fabrizio & Abel- 
son, 1990; Madhani et al., 1990). In particular, mutants 
of the A residue show a phenotype consistent with a sec- 
ond step defect. In light of this, the possibility existed that 
the plant U6atac sequence that we identified could be 
an inactive isoform although we had no evidence for a 
second species. 

To rule out this possibility, we tested the effect of the 
A26G mutation in human U6atac snRNA using the
same method as above. We constructed a U6atac
snRNA expression vector containing the GG14/15CC
mutation and the A26G mutation and tested it for
activity in suppression of the P120 5' splice site CC5/ 
6GG mutant. As shown in Figure 9, lanes 7 and 10, the 
position 26 mutation had no effect on the suppression 
activity of U6atac snRNA. These results show that this 
mutation at position 26 is fully compatible with U12- 
dependent splicing in vivo. 

DISCUSSION 

From the time of their first identification, the virtually 
complete conservation of the splice site and branch 
site sequences of U12-dependent introns has been a
distinctive feature of this group of introns (Hall & Padgett, 
1996). The evolutionary distance over which this con- 
servation holds was significantly extended by the rec- 
ognition of U12-dependent introns in plants (Wu et al., 
1996). Subsequently, many more U12-dependent in- 
trons have been identified in plant genomic sequences 
(Burge et al., 1998a). The splice site sequences of all of
these plant introns fall in the same range of consensus 
scores as animal U12-dependent introns suggesting 
that there are no plant-specific modifications of the 
consensus. This, in turn, suggested that the snRNAs 
that recognize these splice site sequences would likely 
show similar conservation of the interacting regions. 
This conservation should allow us to identify the plant 
homologs of the human U12-dependent spliceosomal 
snRNAs. We therefore used database searches and 
biochemical approaches to identify the plant U12 and 
U6atac snRNA homologs based on the conservation of 
the sequences that interact with splice sites. 

The putative U6atac snRNA homolog that we identi- 
fied in Arabidopsis is likely to be the authentic U6atac 
snRNA based on several lines of evidence. We have 
shown that the sequence we identified in the genome 
database is expressed in plant cells as a small RNA 
with ends corresponding to those predicted by compar- 
ison to human U6atac snRNA. In addition, this gene se- 
quence contains all the elements known to be required 
to promote proper transcription of U6-like genes in plants 
(Waibel & Filipowicz, 1990). The sequence of the ex- 
pressed RNA shows complete conservation of the 5' 
splice site interacting region in agreement with the con- 
servation of the 5' splice site sequence between plants 
and humans (Burge et al., 1998a). There is almost complete
identity between the plant and human sequence
in the region between the splice site pairing region and 
the intramolecular stem-loop. This region encompasses 
both the AGAGA sequence, which is invariant in U6 
snRNA (Brow & Guthrie, 1988) and also found in hu- 
man U6atac snRNA (Tarn & Steitz, 1996b), as well as 
the region of potential base pairing to U12 snRNA. Of 
the 2 nt that differ in this region between plants and hu- 
mans, the A-to-U change at position 21 appears to be 
compensated by a U-to-A change at position 14 of U12 
snRNA, thus maintaining the base pair at this position. 
Interestingly, this same change is seen in Xenopus U12
snRNA (Yu et al., 1996). Whether a similar compen- 
satory change exists in Xenopus U6atac snRNA is not 
known. The second change in Arabidopsis U6atac 
snRNA of G for A at position 26 does not appear to af- 
fect U6atac function (see below). Finally, the intramolec- 
ular stem-loop that immediately follows this region is only 
about 50% conserved in sequence but can be folded into 
a similar structure and is fully functional when trans- 
planted into human U6atac snRNA (see below). All of 
these findings support the identification of this RNA as 
the plant U6atac snRNA homolog. 
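
The compensatory-change argument above can be made concrete with a short check. The following is a minimal Python sketch, not part of the original analysis. It assumes only the nucleotide identities stated in the text (U6atac position 21 is A in human and U in Arabidopsis; U12 position 14 is U in human and A in Arabidopsis). The helper name is_watson_crick and the "uncompensated" case are hypothetical, introduced for illustration only.

```python
# Canonical Watson-Crick partners for RNA nucleotides.
WC_PARTNER = {"A": "U", "U": "A", "G": "C", "C": "G"}


def is_watson_crick(base1: str, base2: str) -> bool:
    """Return True if the two RNA bases form a canonical Watson-Crick pair."""
    return WC_PARTNER.get(base1.upper()) == base2.upper()


# (U6atac position 21, U12 position 14), as described in the text.
pairs = {
    "human": ("A", "U"),
    "Arabidopsis": ("U", "A"),
    # Hypothetical case for illustration: plant U6atac with human U12.
    "uncompensated": ("U", "U"),
}

for label, (u6atac_21, u12_14) in pairs.items():
    print(label, is_watson_crick(u6atac_21, u12_14))
```

In this sketch the human and Arabidopsis combinations both pair, whereas a change in only one of the two snRNAs would not, which is the sense in which the two changes are compensatory.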



A similar set of arguments applies to the putative 
plant U12 snRNA homolog identified here. The se- 
quence is expressed as a small RNA in plants. The size 
and predicted structure are similar to those of vertebrate
U12 snRNAs. The region that is predicted to base
pair to the intron branch site sequence is identical to 
human, as is the branch site sequence itself. The re- 
gion predicted to base pair to U6atac snRNA is also 
very highly conserved, with only one compensatory 
base change that maintains pairing to the plant U6atac 
homolog. Finally, this RNA appears to be the only plant 
small RNA that contains these conserved elements. 
Both Northern and RACE analyses failed to detect other 
RNAs using the 5' end of human U12 snRNA. A recent 
addition to the Arabidopsis genome database con- 
tained a sequence (Accession #AC004255) with high 
similarity to U12 snRNA of human and even greater 
similarity to the expressed plant sequence that we iden- 
tified. The database sequence contains several differ- 
ences from the plant RNA we describe here. None of 
these differences have been seen in the 31 cDNA clones 
we have sequenced. This suggests that this sequence 
is not expressed at an appreciable level in Arabidopsis 
and thus it most likely corresponds to a U12 snRNA 
pseudogene. 

Functional testing of nonconserved 
elements of U6atac snRNA 

To determine if some of the differences noted above be- 
tween plant and human U6atac snRNAs were function- 
ally silent, we tested them in the context of an in vivo 
suppression assay for U6atac snRNA. We have previ- 
ously shown that human U6atac snRNA compensatory 
mutants expressed in Chinese hamster ovary (CHO) 
cells can suppress the in vivo cryptic splicing phenotype
of 5' splice site mutants of a U12-dependent intron
(Incorvaia & Padgett, 1998). Starting with this suppres- 
sor snRNA, we tested the effect of the A-to-G difference 
at position 26 of plant U6atac. This position appears 
to be homologous to the invariant A of the AGC motif 
found in U6 snRNAs (Brow & Guthrie, 1988) and in human
U6atac snRNAs (Tarn & Steitz, 1996b). In yeast U6
snRNA, mutation of this position leads to defects in splic- 
ing, particularly in the second step (Fabrizio & Abelson, 
1990; Madhani et al., 1990). However, mutation of this 
residue in mammalian U6 snRNA had no effect using an 
in vivo suppression assay (Datta & Weiner, 1993). When
the A26-to-G mutation was introduced into the suppressor
U6atac construct (Figs. 8A and 9), full in vivo suppressor
activity was maintained, showing that G26 is fully
compatible with active U12-dependent splicing. Note that
both A26 and G26 can potentially base pair to U12
snRNA in slightly different registers (Figs. 7 and 8A). 

Immediately following this (A/G)GC sequence is a 
region that can form an intramolecular stem-loop that is 
similar in size, position, and structure to a critical region 
of U6 snRNA. Although the plant and human U6atac
sequences differ by almost 50% in this region, the 
predicted structures are similar. To demonstrate that 
the plant stem-loop could still be active in spite of 
these differences, we constructed a chimeric U6atac 
in which the plant stem-loop replaced the human stem- 
loop. The resulting construct was tested for activity in 
vivo using the same suppressor assay described 
above. We found that, in the presence of a mutated 
human U4atac snRNA, this chimeric U6atac snRNA 
was active in vivo (Figs. 8A and 9). This shows that 
the function of the plant intramolecular stem-loop struc- 
ture is conserved. 

These data also provide the first in vivo evidence 
for the predicted function of U4atac snRNA in
U12-dependent splicing. When Tarn and Steitz (1996b)
identified U6atac and U4atac snRNAs in human nu- 
clear extracts, they noted that the two snRNAs could 
adopt a base-paired structure analogous to that formed 
by U4 and U6 snRNAs. In the case of U4/U6 snRNA, 
this structure appears to be required for splicing in 
vivo and in vitro (Wolff & Bindereif, 1992). The pre- 
cise role of this structure is still unclear but it has 
been proposed that U4 snRNA acts as a chaperone 
to deliver the U6 snRNA to the nascent spliceosome 
in an inactive form (Guthrie & Patterson, 1988). Sub- 
sequently, through the action of ATP-dependent heli- 
cases (Raghunathan & Guthrie, 1998), the two snRNAs 
are separated and U6 goes on to form alternative 
base-pairing interactions with U2 and the intron 5' 
splice site whereas U4 appears to be destabilized 
from the spliceosome (Lamond et al., 1988; Yean & 
Lin, 1991; reviewed in Nilsen, 1998). 

The provocative potential sequence complementari- 
ties between U4atac and U6atac, in addition to their 
copurification, led to the proposal that they participated 
in an analogous interaction (Tarn & Steitz, 1996b). Our 
initial experiments with the chimeric human/plant U6atac 
snRNA showed that it was inactive in the in vivo sup- 
pression assay. Inspection of the sequence showed 
that the altered region of the chimeric snRNA also cor- 
responded to the region proposed to form many of the 
base pairs to U4atac snRNA. The plant sequence dif- 
fers substantially in this region and would be expected 
to destabilize the interaction with U4atac. To test if this 
was the cause of the failure of the chimeric U6atac 
snRNA to suppress, we constructed a human U4atac 
snRNA expression gene and made compensatory mu- 
tations in the 5' region to restore the same base-paired 
structure as in human U4atac and U6atac snRNAs 
(Fig. 8B). Cotransfection of this compensatory U4atac 
expression construct led to suppression of the intron 5' 
splice site mutation in a manner that required both the 
chimeric U6atac and the compensatory U4atac con- 
structs (Fig. 9). This demonstrates that there is an in 
vivo requirement for base pairing between U4atac and 
U6atac snRNAs. 



Conserved features of plant U6atac 
and U12 snRNAs 

The evolutionary distance between plants and verte- 
brates has permitted numerous changes to accumu- 
late in their respective snRNAs. This is particularly 
noteworthy in the case of U6atac snRNA. The Arabi- 
dopsis and human homologs of this snRNA are signif- 
icantly more divergent than are U6 snRNAs from the 
same species (65% for U6atac versus 85% for U6). 
The source of this difference is unknown at present. A 
possibility is that there are only one or a few active 
genes for U6atac. It has been reported that the human 
U12 snRNA gene is single copy (Tarn et al., 1995). The
precise number of U6atac genes that exist in any or- 
ganism is unknown at present, but preliminary South- 
ern analyses of Arabidopsis genomic DNA suggest that 
there are only a small number of U6atac genes (data 
not shown). If these snRNAs are expressed from single- 
copy genes, it would reduce the potential for sequence 
"homogenization" seen in multigene families and so 
speed the accumulation of functionally silent changes. 

The pattern of conservation of U6atac sequences cor- 
responds well to the predicted functional regions of 
the molecule. Of particular note is the conservation 
of sequence and the potential for base pairing in the 
U6atac-U12 helix Ia/Ib regions. Biochemical crosslinking
experiments suggested that these regions were juxtaposed
(Tarn & Steitz, 1996b). The conservation of this
complementarity in plants, including a clear case of a
base pair in helix Ia being preserved through compensatory
base alterations, strengthens this view. The conservation
of the nominally unpaired bases in U12 snRNA
in this region suggests that they may have additional 
roles in the U12-dependent spliceosome. Similarly, 
the sequence differences seen in the intramolecular 
stem-loop of U6atac are restricted almost entirely to nu- 
cleotides which appear to be engaged in base-pairing 
interactions. With one exception, all of the putative un- 
paired bases are conserved between plants and hu- 
mans. A strong but not complete correspondence of 
these bases between U6atac and U6 snRNAs has been 
noted (Tarn & Steitz, 1997). Whether this similarity is 
because of a need to interact with common proteins or 
to form a catalytic RNA structure or both is unclear at 
present. In contrast, the potential to form a U6atac-U12 
interaction analogous to the U2-U6 helix III (Sun & Man- 
ley, 1995) is not conserved in the plant homologs. Both 
snRNAs differ from the human sequences in this region, 
with no clear evidence of compensatory changes sug- 
gestive of a base-pairing interaction. 

MATERIALS AND METHODS 

RT-PCR cloning of Arabidopsis U6atac snRNA 

A candidate Arabidopsis U6atac snRNA gene was identified 
by a BLAST search of available sequences using the first 
40 nt of human U6atac snRNA. This identified a single high-scoring
entry (Accession #AB006702). To determine if this
putative snRNA was expressed, first-strand cDNA was reverse
transcribed from 0.5 µg of total Arabidopsis RNA using
Tth polymerase (Perkin-Elmer). The reaction was primed with 
the antisense primer (CACGAATGTGGAGAGCCTTAAC) 
spanning the region between 80 and 101 of the putative plant 
U6atac snRNA gene sequence at 60 °C for 15 min. For amplification
of the U6atac cDNA, a sense primer (GTGTTCGTAGAAAGGAGAGATGG)
spanning bases 1-23 of the
putative U6atac snRNA was used together with the antisense 
primer above. The cDNA was amplified for 1 min at 95 °C, 
1 min at 60 °C, and 1 min at 72 °C for 40 cycles followed by 
7 min at 72 °C. 
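
As a consistency check on the primer description above, the following minimal Python sketch confirms that each primer length matches its stated coordinate range and derives the sense-strand sequence implied by the antisense primer. It is an illustration only, not part of the original protocol; the function name reverse_complement is a hypothetical helper, and the primer sequences are copied from the text.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(seq.upper()))


# Primer sequences and coordinates as given in the text.
antisense = "CACGAATGTGGAGAGCCTTAAC"  # complementary to nt 80-101
sense = "GTGTTCGTAGAAAGGAGAGATGG"     # nt 1-23

# Each primer length should equal the size of its stated interval.
assert len(antisense) == 101 - 80 + 1
assert len(sense) == 23 - 1 + 1

# Sense-strand (DNA) sequence of nt 80-101 implied by the antisense primer.
print(reverse_complement(antisense))
```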

RT-PCR of Arabidopsis U12 snRNA 

Total Arabidopsis RNA was fractionated on a 10% denaturing 
polyacrylamide gel and RNAs between 100 and 200 nt were 
isolated by the "Crush-Soak" procedure (Sambrook et al., 
1989). The eluted RNA was extracted with phenol and chloroform,
ethanol precipitated, washed, and dissolved in water.
The RNA (0.2 µg) was 3' polyadenylated using yeast poly(A)
polymerase (Gibco/BRL) and 2 mM ATP in the supplied buffer
for 10 min at 37 °C in a total volume of 25 µL. First-strand cDNA
was synthesized from 1/10 of the poly(A)-tailed RNA by reverse
transcription using an oligo(dT) primer and Superscript
reverse transcriptase in the supplied buffer and 0.5 mM of each
dNTP at 42 °C for 1 h in a 20-µL reaction. The U12 cDNA was
amplified using a primer (ATGCCTTAAACTTATGAGTAAGGA)
derived from human U12 snRNA spanning bases 1 to 24. Amplification
parameters were 1 min at 95 °C, 1 min at 56 °C, and
1 min at 72 °C for 40 cycles followed by 7 min at 72 °C.
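
The two amplification programs (U6atac and U12) differ only in annealing temperature. The minimal Python sketch below records them side by side and computes the programmed hold time. The class name CyclingProgram and its field names are hypothetical; the numbers are those stated in the text, and ramping time between steps is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CyclingProgram:
    """Three-step PCR program; temperatures in deg C, times in minutes."""
    denature_c: int
    anneal_c: int
    extend_c: int
    step_min: float
    cycles: int
    final_extension_min: float

    def hold_time_min(self) -> float:
        """Total programmed hold time, ignoring ramping between steps."""
        return 3 * self.step_min * self.cycles + self.final_extension_min


# Parameters as stated in the text; only the annealing temperature differs.
u6atac = CyclingProgram(95, 60, 72, 1, 40, 7)
u12 = CyclingProgram(95, 56, 72, 1, 40, 7)

print(u6atac.hold_time_min(), u12.hold_time_min())
```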

Characterizing the ends of U6atac 
and U12 snRNAs by RACE 

For U12 snRNA, the 3' end sequence was determined from 
amplification of poly(A)-tailed RNA using oligo dT and a
human U12-specific 5' primer. Twenty-five clones of this product
were sequenced. The true 5' end sequence was determined
using the SMART PCR cDNA kit (Clontech) to attach
a primer to the 3' end of the cDNA, followed by PCR amplification
using a primer complementary to this sequence and
an internal U12-specific primer. The amplified DNA was gel 
purified, cloned, and sequenced. 

For U6atac snRNA, the 3' end sequence was determined 
by amplifying the poly(A)-tailed RNA using the 5' U6atac
primer (nt 1-23) and a modified oligo dT primer (CDS/3'
primer, Clontech). The amplified DNA was gel purified, cloned,
and sequenced. The 5' end sequence was determined from
cDNA synthesized from the poly(A)-tailed RNA using the
CDS/3' primer that was 3' tailed using dCTP and terminal 
deoxynucleotidyl transferase. The cDNA was amplified with 
the abridged anchor primer above and a primer complemen- 
tary to nt 80-101 of U6atac snRNA. The amplified DNA was 
then gel purified, cloned, and sequenced. 

Construction of U4atac expression plasmid 

The U4atac expression plasmid was generated by the same 
method used previously for U11 and U12 snRNAs. Briefly,
the U1 snRNA coding region of a functional U1 gene was
replaced by PCR techniques with the coding region of 
U4atac snRNA amplified from a U4atac plasmid obtained 
from J. Steitz. Sequence analysis of this plasmid showed 
that it was missing the 3'-most 7 nt of the published U4atac 
sequence and contained a sequence alteration at residues 
60 and 61. This sequence alteration leads to a CG-to-GC 
inversion at these residues when compared to the pub- 
lished sequence (Tarn & Steitz, 1996b) and appears to rep- 
resent an error in the original sequence determination (J. 
Steitz, pers. comm.). For the expression plasmid described 
here, the final 7 nt were supplied by the PCR primer, but 
the inversion at nt 60-61 was left unchanged. The wild-type
U4atac primers were GGCCAGATCTCAACCATCCTTTTCTTGGGGT (5' primer)
and CCGGGTCGACGGTCTGTTTTTGAAACTCCAGAAAGTCTATTTTTCCAAAAATTGCAC
(3' primer). The mutant 5' primer was
GGCCAGATCTCAACCCGTCTCTTTCTTAGGATTGCGCTACTGTC.
These produced PCR products containing either human wild-type
U4atac or human U4atac containing compensatory mutations
to restore base pairing with the chimeric U6atac
snRNA. The PCR fragments were digested with SalI and
BglII restriction enzymes and ligated into a U1 expression
vector from which the U1 coding region had been excised 
(Bond et al., 1991). The sequences of the mutant and wild- 
type snRNAs were confirmed by DNA sequencing. 
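
The cloning strategy relies on restriction sites carried in the 5' tails of the PCR primers. The minimal Python sketch below locates the BglII (AGATCT) and SalI (GTCGAC) recognition sequences in the three primers listed above. It is offered as an illustration, not as the procedure the authors used; the names SITES, PRIMERS, and find_sites are hypothetical, and the primer sequences are copied from the text with line breaks removed.

```python
# Recognition sequences of the two enzymes used for cloning.
SITES = {"BglII": "AGATCT", "SalI": "GTCGAC"}

# Primer sequences as given in the text.
PRIMERS = {
    "wild-type 5' primer": "GGCCAGATCTCAACCATCCTTTTCTTGGGGT",
    "mutant 5' primer": "GGCCAGATCTCAACCCGTCTCTTTCTTAGGATTGCGCTACTGTC",
    "3' primer": "CCGGGTCGACGGTCTGTTTTTGAAACTCCAGAAAGTCTATTTTTCCAAAAATTGCAC",
}


def find_sites(primer: str) -> dict:
    """Map enzyme name to the 0-based index of its site, for sites present."""
    return {
        enzyme: primer.find(site)
        for enzyme, site in SITES.items()
        if site in primer
    }


for name, seq in PRIMERS.items():
    print(name, find_sites(seq))
```

Both 5' primers carry a BglII site and the 3' primer carries a SalI site, each preceded by four extra nucleotides, consistent with the digestion step described above.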

Construction of U6atac mutants 

The U6atac snRNA mutants were made in the expression 
plasmid previously described (Incorvaia & Padgett, 1998) 
using the Altered Sites II system from Promega and single 
mutagenic oligonucleotides. The sequence of the primer used 
for synthesis of the U6atac chimeric snRNA construct was 
GAAGGTTAGCATCTCCTCTGACAGAGACGGGAGAGGCCCTC.
All mutations were confirmed by DNA sequencing.

Analysis of in vivo splicing 

Transient transfection of the P120 minigene and snRNA ex- 
pression plasmids into cultured CHO cells was as described 
(Hall & Padgett, 1996; Kolossova & Padgett, 1997; Incorvaia 
& Padgett, 1998). For these experiments, 0.5 µg of P120
plasmid and 5 µg of each of the snRNA expression plasmids
were added to 1 × 10^6 cells. Where one or more snRNA
plasmids were omitted, a corresponding amount of pUC19 
plasmid DNA was substituted. Total RNA was isolated from 
cells 48 h after transfection, reverse transcribed, and PCR 
amplified as described (Kolossova & Padgett, 1997; Incor- 
vaia & Padgett, 1998). The products were analyzed by aga- 
rose gel electrophoresis. The DNA bands were visualized 
using ethidium bromide and photographed using a digital video 
camera (Kodak). Independent transfections and analyses gave 
substantially similar results. 

Northern blot analysis 

Ten micrograms of total cellular Arabidopsis RNA was fractionated
on an 8% polyacrylamide-urea gel and electroblotted
to a nylon membrane. Prehybridization was carried out at 
60 °C for 2 h in 6× SSC, 5× Denhardt's solution, 0.1% SDS,
and 0.1 mg/mL of sonicated salmon sperm DNA. Probes
were prepared by random priming of cDNA plasmids using
32P-dCTP and hybridized to the blots in the above solution at
a concentration of 10^4 cpm/mL. Hybridizations were performed
at 60 °C for 16 h. The membranes were washed at
room temperature for 15 min in 2× SSC and then at 60 °C in
2× SSC and 0.05% SDS for 20 min.